Abstract of CA2306527

A method and user interface which allow users to 
make decisions about how to pronounce words 
and parts of words based on audio cues and 
common words with well known pronunciations. 
Users input or select words for which they want 
to set or modify pronunciations. To set the 
pronunciation of a given letter or letter 
combination in the word, the user selects the 
letters and is presented with a list of common 
words whose pronunciations, or portions thereof, 
are substantially identical to possible 
pronunciations of the selected letters. The list of 
sample, common words is ordered based on 
frequency of correlation in common usage, the 
most common being designated as the default 
sample word, and the user is first presented with 
a subset of the words in the list which are most 
likely to be selected. In addition, the present 
invention allows for storage in the dictionary of 
several different pronunciations for the same 
word, to allow for contextual differences and 
individual preferences. 

GRAPHICAL USER INTERFACE AND METHOD FOR MODIFYING
PRONUNCIATIONS IN TEXT-TO-SPEECH AND SPEECH RECOGNITION
SYSTEMS


Copyright Notice 

A portion of the disclosure of this patent document contains material which is 
subject to copyright protection. The copyright owner has no objection to the facsimile 
reproduction by anyone of the patent document or the patent disclosure, as it appears in 
the Patent and Trademark Office patent files or records, but otherwise reserves all 
copyright rights whatsoever. 

Background Of The Invention 

The invention disclosed herein relates generally to user interfaces and, in 
particular, to graphical user interfaces for use with text-to-speech and automated speech 
recognition systems. 

Voice or speech recognition and generation technology is gaining in importance 
as an alternative or supplement to other, conventional input and output devices. This 
will be particularly true with continued improvements and advances in the underlying 
software methodologies employed and in the hardware components which support the 
processing and storage requirements. As these technologies become more generally 
available to and used by the mass market, improvements are needed in the techniques 
employed in initializing and modifying speech recognition and generation systems. 

A few products exist which allow users to process files of text to be read aloud 
by synthesized or recorded speech technologies. In addition, there are software products 
used to process spoken language as input, identify words and commands, and trigger an 
action or event. Some existing products allow users to add words to a dictionary, make 
modifications to word pronunciations in the dictionary, or modify the sounds created by 
a text-to-speech engine. 

However, users of these products are required to understand and employ
specialized information about grammar, pronunciation, and linguistic rules of each
language in which word files are to be created. Moreover, in some of these products the
means of representing pronunciations requires mastery of a mark-up language with
unique pronunciation keys not generally used in other areas.

As a result, these products make text-to-speech and automated speech 
recognition technology inflexible and less accessible to the general public. They require 
users to become experts in both linguistic rules and programming techniques. The 
inflexibility arises in part because these products use general rules of the language in 
question to determine pronunciation without regard to context, such as geographic
context in the form of dialects, or individual preferences regarding the pronunciation of 
certain words such as names. 

Further, the existing products generally provide less than satisfactory results in
pronunciations or translations of pronunciations. The products do not perform well with
respect to many types of words including acronyms, proper names, technological terms,
trademarks, or words taken from other languages. Nor do these products perform 
particularly well in accounting for variations in pronunciations of words depending on 
their location in a phrase or sentence (e.g., the word "address" is pronounced differently 
when used as a noun as opposed to a verb). 

As a result, there is a need for a user interface method and system which
expresses pronunciation rules and options in a simple way so that nonexpert users can
take fuller advantage of the benefits of text-to-speech and speech recognition 
technologies. 

Summary Of The Invention 



It is an object of the present invention to solve the problems described above
with existing text-to-speech and speech recognition systems.

It is another object of the present invention to provide a simple and intuitive user
interface for setting and modifying pronunciations of words.

It is another object of the present invention to provide for the use in text-to-speech
and speech recognition systems of sounds or letter groups which are not typically
used in or even violate the rules of a language.

These and other objects of the invention are achieved by a method and user 
interface which allows users to make decisions about how to pronounce words and parts 
of words based on audio cues and common words with well known pronunciations. 

Thus, in some embodiments, users input or select words for which they want to
set or modify pronunciations. To set the pronunciation of a given letter or letter
combination in the word, the user selects the letter(s) and is presented with a list of
common words whose pronunciations, or portions thereof, are substantially identical to
possible pronunciations of the selected letters. Preferably the list of sample, common
words is ordered based on frequency of correlation in common usage, the most common
being designated as the default sample word, and the user is first presented with a subset
of the words in the list which are most likely to be selected.

In addition, embodiments of the present invention allow for storage in the 
dictionary of several different pronunciations for the same word, to allow for contextual 
differences and individual preferences. 

Further embodiments provide for the storage of multiple dictionaries for different
languages, but allow users to select pronunciations from various dictionaries to account
for special words, parts of words, and translations. As a result, users may create and 
store words having any sound available to the system, even when the sound doesn't 
generally correspond with letters or letter groups according to the rules of the language. 



In addition to modifying pronunciations of letters in a word, embodiments of the
present invention allow users to easily break words into syllables or syllable-like letter
groupings or word subcomponents even when the rules of a given language do not
provide for such groupings as syllables, and to specify which such syllables should be
accented. As used herein, the word syllable refers to such traditional syllables as well as
other groupings.

Brief Description Of The Drawings

The invention is illustrated in the figures of the accompanying drawings which are
meant to be exemplary and not limiting, in which like references refer to like or
corresponding parts, and in which: 

Fig. 1 is a block diagram of a system in accordance with embodiments of the 
present invention; 

Fig. 2 is a flow chart showing broadly a process of allowing users to modify
word pronunciations in accordance with the present invention using the system of Fig. 1;

Figs. 3A-3B contain a flow chart showing in greater detail the process of
allowing users to modify word pronunciations in accordance with an embodiment of the
present invention;

Fig. 4 is a flow chart showing a process of testing the pronunciation of a word;
and

Figs. 5-9 contain diagrams of screen displays showing the graphical user interface
of one embodiment of the present invention.

Detailed Description Of The Preferred Embodiments 

Embodiments of the present invention are now described in detail with reference
to the drawings in the figures.

A text to speech ("TTS") and automated speech recognition ("ASR") system 10
is shown in Fig. 1. The system 10 contains a computerized apparatus or system 12
having a microcontroller or microprocessor 14 and one or more memory devices 16.
The system 10 further has one or more display devices 18, speakers 20, one or more
input devices 22 and a microphone 24. All such components are conventional and
known to those of skill in the art and need not be further described here.

Memory device or devices 16, which may be incorporated in computer apparatus
12 as shown or may be remotely located from computer 12 and accessible over a
network or other connection, store several programs and data files in accordance with
the present invention. A pronunciation selection program 26 allows, when executed on
microcontroller 14, for the generation of the user interface described herein, the
processing of a user's input and the retrieval of data from databases 28 and 30.
Dictionary databases 28 are a number of databases or data files, one for each language
handled by the system 10, which store character strings and one or more pronunciations
associated therewith. Pronunciation databases 30 are a number of databases or data
files, one for each of the languages, containing records each having a character or
character group and a number of sample words associated therewith which contain
characters that are pronounced in a manner which is substantially identical to the way the
characters may be pronounced. The sample words are selected in creating the
pronunciation databases 30 based on grammatical and linguistic rules for the language.
Preferably, the sample words for each character or character group (e.g., diphthong) are
ordered generally from more common usage in pronunciation of the character to less
common.

Although shown as two databases, the dictionary database 28 and pronunciation 
database 30 may be structured as one data file or in any other format which facilitates 
retrieval of the pronunciation data as described herein and/or which is required to meet 
the needs of a given application or usage. 
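
By way of illustration only, the two databases might be modeled in memory as sketched below. The class and field names (SampleWord, PronunciationRecord, phonemes), the phoneme notation, and every sample entry other than "buy" and "address" are assumptions made for this sketch and are not part of the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class SampleWord:
    word: str      # common word shown to the user, e.g. "buy"
    phonemes: str  # hypothetical code for how the letters sound in that word


@dataclass
class PronunciationRecord:
    letters: str                                 # character or character group
    samples: list = field(default_factory=list)  # ordered most common first


# Hypothetical pronunciation database 30 for one language, keyed by letter group.
pronunciation_db = {
    "i": PronunciationRecord(
        "i",
        [SampleWord("buy", "AY"), SampleWord("bit", "IH"), SampleWord("machine", "IY")],
    ),
}

# Hypothetical dictionary database 28: each string maps to one or more pronunciations.
dictionary_db = {"address": ["AD-dress", "ad-DRESS"]}


def default_sample(letters):
    """Return the first (most common) sample word for a letter group."""
    return pronunciation_db[letters].samples[0]
```

Because the sample list is kept in most-common-first order, the default sample word is simply the first element, and the two structures could equally be merged into one file as noted above.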

The system 10 further contains a TTS module 32 and ASR module 34 stored in 
memory 16. These modules are conventional and known to those of skill in the art and 
include, for example, the ViaVoice® software program available from IBM. These 
modules 32 and 34 convert text stored as digital data to audio signals for output by the 
speakers 20 and convert audio signals received through microphone 24 into digital data. 
The modules retrieve and utilize pronunciation data stored in the dictionary databases
28.

A method for allowing users to easily modify the pronunciation data stored in
dictionary databases 28, as performed by pronunciation selection program 26, is
described generally in Fig. 2 and in greater detail in Figs. 3A-3B. Referring to Fig. 2, in
accordance with the invention a character string, which may be a word, name, etc., is
displayed on display device 18, step 50. A user uses input devices 22 to select one or
more letters from the string, step 52. As is understood, pronunciation variations may be
linked to individual letters such as vowels or to groups of letters such as "ou", "ch", "th"
or "gh". The program 26 queries pronunciation database 30 to retrieve the sample
words associated with the selected letter or letter group, step 54. If the letter or letter
group is absent from the pronunciation database 30, an error message may be sent or
sample words for one of the letters may be retrieved. Some or all of the sample words
are displayed, step 56, and the user selects one of the words, step 58. The program 26
then generates pronunciation data for the character string using the sample word to
provide a pronunciation of the selected letter(s), step 60. The string and pronunciation
data are stored in the dictionary database 28, step 62, and the string may be audibly
output by the output of the TTS module 32 or used to create a speaker verification or
utterance for ASR module 34.
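
The flow of Fig. 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name set_pronunciation, the (word, sound) tuple layout, the segment-list form of the pronunciation data, and the choose callback standing in for the user's selection are all hypothetical.

```python
def set_pronunciation(string, start, end, choose, pronunciation_db, dictionary_db):
    """Steps 52-62: assign a sample word's sound to string[start:end] and store it."""
    letters = string[start:end]              # step 52: letters selected by the user
    samples = pronunciation_db.get(letters)  # step 54: query database 30
    if not samples:
        # Error-message branch for a letter group absent from the database.
        raise KeyError(f"no sample words for {letters!r}")
    _word, sound = choose(samples)           # steps 56-58: display list, user picks one
    # Step 60 (assumed format): one (letters, sound) pair per segment;
    # segments the user did not touch carry None and fall back to defaults.
    segments = [(string[:start], None), (letters, sound), (string[end:], None)]
    pronunciation = [seg for seg in segments if seg[0]]
    dictionary_db.setdefault(string, []).append(pronunciation)  # step 62
    return pronunciation


# Hypothetical data: each sample is (sample word, sound code), most common first.
pron_db = {"i": [("buy", "AY"), ("bit", "IH")]}
dict_db = {}
result = set_pronunciation("michael", 1, 2, lambda s: s[0], pron_db, dict_db)
```

Here the callback simply accepts the default (first) sample, so the "i" in "michael" is assigned the sound heard in "buy".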

The process implemented by program 26 is described in more detail in Figs. 3A-3B.
An exemplary embodiment of a user interface used during this process is illustrated
in Figs. 5-9. As shown in Fig. 5, interface 190 displayed on display device 18 contains:
an input box 200 for manual input of or display of selected characters; a test button 202
which is inactive until a word is selected; a modify button 204 which is similarly inactive
until a word is selected; a selection list 206 consisting of the choices "sound", "accent"
and "syllable" (or a "grouping"); and a workspace 208.

As explained above, the system 10 preferably contains multiple dictionary and
pronunciation databases representing different languages. Referring now to Fig. 3A, a
user selects one of the languages, step 70, and the program 26 opens the dictionary for
the selected language, step 72. To select a word or other character string, a user can
choose to browse the selected dictionary, step 74, in which case the user selects an
existing word from the database, step 76. Otherwise, the user enters a word such as by typing
into input box 200, step 78.

Next, the user can choose whether to test the pronunciation of the word, step 80, 
by selecting test button 202. The process of testing a word pronunciation is described 
below with reference to Fig. 4. 

The user can choose to modify the word's pronunciation, step 82, by selecting
modify button 204. If not, the user can store the word and current pronunciation by
selecting the "OK" button in dialog 190, step 84. If the word is not an existing word in
the dictionary database 28, step 86, the word and pronunciation data are stored in the
dictionary, step 88. As explained below with reference to Fig. 4, the pronunciation data
for an unmodified word is generated using default pronunciations based on the rules of
the selected language. If the word already exists, the new pronunciation data is stored
with the word in the dictionary, step 90, and alternate pronunciations may be referred to
from contextual circumstances.
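
A minimal sketch of the storage branch (steps 86-90) follows, assuming the dictionary is a mapping from each word to a list of pronunciations. The function name store_pronunciation and the check that skips an identical pronunciation are assumptions of this sketch, not features stated in the disclosure.

```python
def store_pronunciation(dictionary_db, word, pronunciation):
    """Add a new entry, or add an alternate pronunciation to an existing entry."""
    if word not in dictionary_db:                   # step 86: word not yet in dictionary
        dictionary_db[word] = [pronunciation]       # step 88: store word and pronunciation
    elif pronunciation not in dictionary_db[word]:  # assumed: do not store exact repeats
        dictionary_db[word].append(pronunciation)   # step 90: keep alongside earlier ones
    return dictionary_db[word]


db = {}
store_pronunciation(db, "address", "AD-dress")
store_pronunciation(db, "address", "ad-DRESS")
store_pronunciation(db, "address", "ad-DRESS")  # repeat is ignored in this sketch
```

Keeping a list per word is what allows the alternate pronunciations to be chosen later from context.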

If the user wishes to modify the pronunciation, the three choices in selection list 
206 are available. 

The selected word, now appearing in input box 200, is broken into individual 
characters and copied into workspace 208. See Fig. 6. Workspace 208 further shows 
syllable breaks (the dash in workspace 208) and accent marks (the apostrophes in 
workspace 208) for the current pronunciation. 

If the user selects to modify the syllable break, step 92, a breakpoint symbol 210
is displayed, see Fig. 7. The symbol 210 may be moved by the user to identify a desired
syllable breakpoint, step 94. The program 26 breaks any existing syllable into two syllables
at a selected breakpoint, step 96.
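
The splitting of step 96 might look like the sketch below, assuming syllables are held as a list of substrings and the breakpoint is a character offset into the whole word. The function name add_syllable_break and this representation are hypothetical.

```python
def add_syllable_break(syllables, position):
    """Split whichever syllable spans character offset `position` into two."""
    result, offset = [], 0
    for syllable in syllables:
        cut = position - offset          # breakpoint relative to this syllable
        if 0 < cut < len(syllable):      # breakpoint falls strictly inside it
            result.extend([syllable[:cut], syllable[cut:]])
        else:                            # outside, or already on a boundary
            result.append(syllable)
        offset += len(syllable)
    return result


# "mi-chael" with a breakpoint placed between the "a" and the "e" (offset 5)
broken = add_syllable_break(["mi", "chael"], 5)
```

A breakpoint placed on an existing boundary leaves the syllables unchanged in this sketch.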

If the user selects to modify the accent, step 98, an accent type selection icon
group 212 is displayed in interface 190 (see Fig. 8). The group 212 contains three icons:
a primary accent (or major stress) icon 212a, a secondary accent (or minor stress) icon
212b, and a no accent (or unstressed) icon 212c. The user selects an accent level by
clicking one of the icons, step 100. The user then selects a syllable, step 102, by, for
example, selecting a box in workspace 208 immediately following the syllable. The
program 26 identifies the selected syllable with the selected accent level, step 104, and
may further adjust the remaining accents in accordance with rules of the selected
language. For example, if the language provides for any one primary accented syllable,
and the user selects a second syllable for a primary accent, the program may change the
first primary accent to a secondary accent, or may delete all remaining accents entirely.
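
One way the accent adjustment of step 104 could be expressed is sketched below. The numeric accent levels, the function name set_accent and the single_primary flag are assumptions; the sketch implements only the first of the two options described (demoting an earlier primary accent to secondary).

```python
PRIMARY, SECONDARY, UNSTRESSED = 2, 1, 0  # hypothetical encoding of icons 212a-212c


def set_accent(accents, index, level, single_primary=True):
    """Return a new accent list with `level` applied to the syllable at `index`."""
    updated = list(accents)
    if level == PRIMARY and single_primary:
        # Language rule assumed here: at most one primary accent per word,
        # so any existing primary accent is demoted to secondary.
        updated = [SECONDARY if a == PRIMARY else a for a in updated]
    updated[index] = level
    return updated


# "mi'-cha-el" becomes "mi-cha'-el": the old primary is demoted, not deleted.
accents = set_accent([PRIMARY, UNSTRESSED, UNSTRESSED], 1, PRIMARY)
```

The alternative rule, deleting all remaining accents, would replace the demotion line with one that resets every other syllable to unstressed.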

Referring now to Fig. 3B, if the user selects to modify a letter sound in list 206,
step 106, the user selects one or more letters in workspace 208, step 108. The program
26 retrieves the sample words from the pronunciation database 30 for the selected
language whose pronunciations, or portions thereof, are associated or linked with the
selected letter(s), step 110. The words are displayed in word list 214, see Fig. 9. The
sample word which represents the default pronunciation for the selected letter(s) is
highlighted, step 112. See Fig. 9, in which the sample word "buy" is highlighted in word
list 214 for pronunciation of the selected letter "i". A user can also listen to the
pronunciations of the sample words. As also shown in Fig. 9, only two or three of the
sample words may be shown in word list 214, with an option for the user to see and hear
additional words.
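
The partial display of word list 214 can be illustrated with the short sketch below, which assumes the sample words arrive already ordered from most to least common. The function name word_list, the default of three visible entries, and all sample words other than "buy" are hypothetical.

```python
def word_list(samples, shown=3):
    """Return (visible, hidden, default) for samples ordered most common first."""
    visible = samples[:shown]                    # entries shown at once in list 214
    hidden = samples[shown:]                     # revealed by the "more" option
    default = samples[0] if samples else None    # highlighted entry, step 112
    return visible, hidden, default


visible, hidden, default = word_list(["buy", "bit", "machine", "bird"])
```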

If the user selects one of the sample words, step 114, the pronunciation data, or
portions thereof, for the selected word is associated with the letter(s) selected in the
selected word contained in workspace 208, step 116. The modified word may then be
modified further or stored, in accordance with the process described above, or may be
tested as described below.

In accordance with certain aspects of the present invention, it is recognized that
most languages including English contain words taken from other languages. Therefore,
the user is given (e.g., in word list 214 after selecting "more") the option of selecting a
pronunciation for the selected letters from another language, step 118. The user then
selects the desired language, step 120, and the program 26 retrieves sample words
associated with the selected letter(s) from the pronunciation database file 30 for that
selected language, step 122. The sample words are then presented for the user's
selection as explained above.
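
The cross-language lookup of steps 118-122 amounts to choosing a different per-language database before querying it, as in the following sketch. The function name samples_for, the language keys and the sample entries are illustrative assumptions only.

```python
def samples_for(letters, language, pronunciation_dbs):
    """Look up sample words for `letters` in the database for `language`."""
    language_db = pronunciation_dbs.get(language, {})  # step 120: language chosen by user
    return language_db.get(letters, [])                # step 122: retrieve sample words


# Hypothetical per-language pronunciation databases 30 (entries are illustrative).
pronunciation_dbs = {
    "english": {"ch": ["church", "chorus"]},
    "hebrew": {"ch": ["chanukah"]},
}

hebrew_ch = samples_for("ch", "hebrew", pronunciation_dbs)
```

An empty list is returned when either the language or the letter group is unknown, leaving the caller to decide whether to report an error.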

As a result, a simple and flexible process is achieved for allowing users to modify
word pronunciations. As one example of the ease and flexibility of the process, the word
"michael" selected in Fig. 9 may be modified from the English pronunciation, "MiK'-el"
to a Hebrew name "Mee-cha'-el" by adding a syllable break between the "a" and "e" and
"i" and "ch", placing the primary accent on the new syllable "cha," and selecting
appropriate pronunciations for the "i", "ch" (e.g., from the Hebrew language dictionary),
"a" and "e" based on common words. No grammatical or linguistic expertise is required.

The process of testing a word's pronunciation is shown in Fig. 4. If the word
already is contained in the dictionary database 28, step 140, the stored pronunciation is
retrieved, step 142. If more than one pronunciation exists for the word, the user may be
prompted to select one, or a default used. If the word is not yet present, then for each
letter or letter group, if a user has selected a pronunciation using the program 26, step
144, that pronunciation data is retrieved, step 146, and otherwise a default pronunciation
may be selected, step 148. When all letters have been reviewed, step 150, the program
26 generates a pronunciation for the word using the retrieved letter pronunciations, step
152. Finally the TTS module outputs an audible representation of the retrieved or
generated word pronunciation, step 154.
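
The decision logic of Fig. 4 up to step 152 might be sketched as follows. The function name, the space-joined string used as pronunciation data, and the fallback of using the letters themselves when no default rule exists are assumptions of this sketch; the audible output of step 154 is left to the TTS module and is not shown.

```python
def pronunciation_for_test(word, groups, dictionary_db, user_choices, default_rules):
    """Return a stored pronunciation, or build one letter group by letter group."""
    if word in dictionary_db:                    # step 140: word already stored?
        return dictionary_db[word][0]            # step 142: first entry used as default
    sounds = []
    for group in groups:                         # loop completes at step 150
        if group in user_choices:                # step 144: user picked a sound?
            sounds.append(user_choices[group])   # step 146: use the user's choice
        else:
            # Step 148: language default; fall back to the raw letters (assumed).
            sounds.append(default_rules.get(group, group))
    return " ".join(sounds)                      # step 152: assemble the pronunciation


built = pronunciation_for_test(
    "michael", ["m", "i", "ch", "ae", "l"], {}, {"ch": "KH"}, {"i": "AY"}
)
stored = pronunciation_for_test("address", [], {"address": ["AD-dress", "ad-DRESS"]}, {}, {})
```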

Because the system described herein allows for multiple pronunciations for a
single word, the TTS module must identify which pronunciation is intended for the word. 
The TTS module can identify the pronunciation based on the context in which the word 
is used. For example, the pronunciations may be associated with objects such as users 
on a network, such that a message intended for a specific user would result in a correct 
selection of pronunciations. As another example, the TTS module may identify a word 
usage as noun vs. verb, and select the appropriate pronunciation accordingly. 
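
A hedged sketch of context-based selection is given below. It assumes each stored pronunciation carries optional tags for an associated user object and a part of speech; the record layout, the function name select_pronunciation, and the precedence of user over part of speech are choices made for this illustration only.

```python
def select_pronunciation(records, user=None, part_of_speech=None):
    """Pick the record matching the context, else fall back to the first record."""
    if user is not None:
        for record in records:
            if record.get("user") == user:           # pronunciation tied to a user object
                return record["pron"]
    if part_of_speech is not None:
        for record in records:
            if record.get("pos") == part_of_speech:  # noun vs. verb usage
                return record["pron"]
    return records[0]["pron"]                        # default pronunciation


records = [
    {"pron": "AD-dress", "pos": "noun"},
    {"pron": "ad-DRESS", "pos": "verb"},
]
as_verb = select_pronunciation(records, part_of_speech="verb")
```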

While the invention has been described and illustrated in connection with
preferred embodiments, many variations and modifications as will be evident to those
skilled in this art may be made without departing from the spirit and scope of the
invention, and the invention is thus not to be limited to the precise details of
methodology or construction set forth above as such variations and modifications are
intended to be included within the scope of the invention.

Claims:

1. A method implemented on a computer for allowing a user to set a
pronunciation of a string of characters, the method comprising: 

allowing the user to select one or more characters in the string; 

retrieving from a database accessible by the computer a plurality of 
samples of words or parts of words representing possible pronunciations of the selected 
one or more characters and displaying the retrieved samples; 

allowing the user to select one of the displayed samples; and

storing a first pronunciation record comprising the string of characters 
with the selected one or more characters being assigned the pronunciation associated 
with the sample selected by the user. 

2. The method of claim 1, comprising generating a pronunciation of the 
character string using the pronunciation represented by the sample selected by the user as 
the pronunciation for the selected one or more characters, and audibly outputting the 
generated pronunciation. 

3. The method of claim 2, comprising allowing the user to select another of the 
displayed samples after audibly outputting the generated pronunciation. 

4. The method of claim 1, comprising allowing the user to select a second of the 
displayed samples and storing a second pronunciation record comprising the string of 
characters with the selected one or more characters being assigned the pronunciation
represented by the second sample selected by the user. 

5. The method of claim 4, comprising, during a text-to-speech process of 
generating audible output of a text file containing the string of characters, selecting one 
of the first and second pronunciation records. 

6. The method of claim 5, comprising associating the first and second
pronunciation files with first and second objects, respectively, and selecting one of the
first and second objects, and wherein the step of selecting one of the first and second
pronunciation records comprises selecting the pronunciation record associated with the
selected object.

7. The method of claim 4, comprising, during a speech recognition process, 
recognizing a pronunciation of the string of characters by a user and selecting one of the 
first and second pronunciation records which most closely matches the recognized 
pronunciation. 

8. The method of claim 7, comprising associating the first and second 
pronunciation files with first and second objects, respectively, and selecting one of the 
first and second objects which is associated with the selected pronunciation record.

9. The method of claim 1, comprising allowing the user to identify a part of the 
character string as a separate syllable, and wherein the step of storing the first 
pronunciation record comprises storing data representing the identified separate syllable. 

10. The method of claim 1, comprising allowing the user to identify a part of the 
character string to associate with an accent, and wherein the step of storing the first 
pronunciation record comprises storing data representing the identified accent. 

11. The method of claim 1, comprising receiving the character string as input by
the user.

12. The method of claim 1, comprising allowing the user to select the character 
string from a dictionary database accessible to the computer. 

13. The method of claim 1, comprising allowing the user to select a preferred 
language and wherein the step of retrieving the samples representing possible 
pronunciations of the selected one or more characters comprises selecting a database for 
the preferred language from a plurality of language databases and retrieving the samples
from the selected database.

14. The method of claim 1, comprising allowing the user to select a second
language for the selected one or more characters and retrieving additional word samples 
from a second database corresponding to the selected second language. 

15. An article of manufacture comprising a computer readable medium storing
program code for, when executed, causing a computer to perform a graphical user
interface method for allowing a user to set a pronunciation of a string of characters, the
method comprising:

allowing the user to select one or more characters in the string;

retrieving from a database accessible by the computer a plurality of
samples of words or parts of words representing possible pronunciations of the selected
one or more characters and displaying the retrieved samples;

allowing the user to select one of the displayed samples; and

storing a first pronunciation record comprising the string of characters
with the selected one or more characters being assigned the pronunciation associated
with the sample selected by the user.

16. The article of claim 15, wherein the method the program code causes the 
computer to perform comprises generating a pronunciation of the character string using 
the pronunciation represented by the sample selected by the user as the pronunciation for 
the selected one or more characters, and audibly outputting the generated pronunciation. 

17. The method of claim 16, wherein the method the program code causes the
computer to perform comprises allowing the user to select another of the displayed
samples after audibly outputting the generated pronunciation.

18. The method of claim 15, wherein the method the program code causes the
computer to perform comprises allowing the user to select a second of the displayed
samples and storing a second pronunciation record comprising the string of characters
with the selected one or more characters being assigned the pronunciation represented by
the second sample selected by the user.

19. The method of claim 18, wherein the method the program code causes the 
computer to perform comprises, during a text-to-speech process of generating audible 
output of a text file containing the string of characters, selecting one of the first and 
second pronunciation records. 

20. The method of claim 19, wherein the method the program code causes the 
computer to perform comprises associating the first and second pronunciation files with 
first and second objects, respectively, and selecting one of the first and second objects, 
and wherein the step of selecting one of the first and second pronunciation records 
comprises selecting the pronunciation record associated with the selected object.

21. The method of claim 18, wherein the method the program code causes the
computer to perform comprises, during a speech recognition process, recognizing a 
pronunciation of the string of characters by a user and selecting one of the first and 
second pronunciation records which most closely matches the recognized pronunciation. 

22. The method of claim 21, wherein the method the program code causes the 
computer to perform comprises associating the first and second pronunciation files with 
first and second objects, respectively, and selecting one of the first and second objects 
which is associated with the selected pronunciation record. 

23. A graphical user interface system for allowing a user to modify a 
pronunciation of a string of characters, the system comprising:

a dictionary database stored on a memory device comprising a plurality of 
first character strings and associated pronunciation records; 

a pronunciation database stored on a memory device comprising a 
plurality of second character strings each comprising one or more characters and each 
associated with a plurality of words, each word having one or more characters which are 
pronounced in the word in substantially identical fashion to one manner in which the
associated second character string may be pronounced; 

an input/output system for allowing a user to select one of the first 
character strings from the dictionary database, to select one or more characters from the 
selected string, and to select one of the words in the pronunciation database; and

a programmable controller for generating a pronunciation record 
comprising the selected first character string with the selected one or more characters 
being assigned the pronunciation associated with the word sample selected by the user. 